74 research outputs found

    Concept-Based Visual Analysis of Dynamic Textual Data

    Full text link
    Analyzing how interrelated ideas flow within and between social groups helps us understand the propagation of information, ideas, and thoughts on social media. Existing dynamic text analysis work on idea flow is mostly based on topic models, so when investigating the reasons behind the flow of ideas, users must inspect the underlying texts, which is tedious given their large volume and complex structure. To solve this problem, we propose a concept-based dynamic visual text analytics method that illustrates how the content of ideas changes and helps users analyze the root causes of idea flow. We use concepts to summarize the content of ideas and show the flow of concepts with flow lines. To ensure the stability of the flow lines, a constrained t-SNE projection algorithm is used to display the change of concepts over time and the correlations between them. To better convey anomalous changes in concepts, we propose a method that detects and highlights time periods with anomalous concept change based on anomaly detection. A qualitative evaluation and a case study on real-world Twitter datasets demonstrate the correctness and effectiveness of our visual analytics method. Comment: in Chinese language
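
    The abstract mentions detecting time periods with anomalous concept change but does not describe the detector. As a minimal sketch of one common approach, the snippet below flags periods whose concept frequency departs sharply from the series median using a robust z-score; the function name, the MAD scaling constant, and the threshold are illustrative assumptions, not the authors' method.

```python
import numpy as np

def detect_anomalous_periods(concept_counts, threshold=3.0):
    """Flag time steps whose concept frequency deviates strongly from the median."""
    counts = np.asarray(concept_counts, dtype=float)
    median = np.median(counts)
    mad = np.median(np.abs(counts - median))
    mad = mad if mad > 0 else 1e-9           # guard against a constant series
    robust_z = 0.6745 * (counts - median) / mad
    return np.where(np.abs(robust_z) > threshold)[0]

# Example: a concept whose frequency spikes in periods 5 and 6.
freq = [3, 4, 2, 3, 5, 40, 38, 4, 3, 2]
print(detect_anomalous_periods(freq))        # -> [5 6]
```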

    From Capture to Display: A Survey on Volumetric Video

    Full text link
    Volumetric video, which offers immersive viewing experiences, is gaining increasing prominence. With its six degrees of freedom, it provides viewers with greater immersion and interactivity than traditional videos. Despite this potential, volumetric video services pose significant challenges. This survey conducts a comprehensive review of the existing literature on volumetric video. We first provide a general framework for volumetric video services, followed by a discussion of prerequisites for volumetric video, encompassing representations, open datasets, and quality assessment metrics. We then delve into the current methodologies for each stage of the volumetric video service pipeline, detailing capturing, compression, transmission, rendering, and display techniques. Lastly, we explore various applications enabled by this pioneering technology and present an array of research challenges and opportunities in the domain of volumetric video services. This survey aspires to provide a holistic understanding of this burgeoning field and shed light on potential future research trajectories, aiming to bring the vision of volumetric video to fruition. Comment: Submitted

    Deep Learning for Edge Computing Applications: A State-of-the-Art Survey

    Get PDF
    With the booming development of the Internet of Things (IoT) and communication technologies such as 5G, our future world is envisioned as an interconnected entity in which billions of devices will provide uninterrupted service to our daily lives and to industry. Meanwhile, these devices will generate massive amounts of valuable data at the network edge, calling for not only instant data processing but also intelligent data analysis in order to fully unleash the potential of edge big data. Neither traditional cloud computing nor on-device computing can sufficiently address this problem, owing to high latency and limited computation capacity, respectively. Fortunately, the emerging edge computing paradigm sheds light on the issue by pushing data processing from the remote network core to the local network edge, remarkably reducing latency and improving efficiency. Besides, recent breakthroughs in deep learning have greatly enhanced data processing capabilities, enabling the rapid development of novel applications such as video surveillance and autonomous driving. The convergence of edge computing and deep learning is believed to bring new possibilities to both interdisciplinary research and industrial applications. In this article, we provide a comprehensive survey of the latest efforts on deep-learning-enabled edge computing applications and, in particular, offer insights on how to leverage deep learning advances to facilitate edge applications in four domains, i.e., smart multimedia, smart transportation, smart city, and smart industry. We also highlight the key research challenges and promising research directions therein. We believe this survey will inspire more research and contributions in this promising field.

    Unimodal Training-Multimodal Prediction: Cross-modal Federated Learning with Hierarchical Aggregation

    Full text link
    Multimodal learning has seen great success in mining data features from multiple modalities, with remarkable improvements in model performance. Meanwhile, federated learning (FL) addresses the data sharing problem, enabling privacy-preserving collaborative training that provides access to sufficient, valuable data. Great potential therefore arises from their confluence, known as multimodal federated learning. However, a limitation of the predominant approaches is that they often assume each local dataset records samples from all modalities. In this paper, we aim to bridge this gap by proposing an Unimodal Training - Multimodal Prediction (UTMP) framework in the context of multimodal federated learning. We design HA-Fedformer, a novel transformer-based model that enables unimodal training with only a unimodal dataset at each client and multimodal testing by aggregating multiple clients' knowledge for better accuracy. The key advantages are twofold. First, to alleviate the impact of non-IID data, we develop an uncertainty-aware aggregation method for the local encoders with layer-wise Markov chain Monte Carlo sampling. Second, to overcome the challenge of unaligned language sequences, we implement a cross-modal decoder aggregation to capture the hidden signal correlation between decoders trained on data from different modalities. Our experiments on popular sentiment analysis benchmarks, CMU-MOSI and CMU-MOSEI, demonstrate that HA-Fedformer significantly outperforms state-of-the-art multimodal models under the UTMP federated learning framework, with 15%-20% improvement on most attributes. Comment: 10 pages, 5 figures
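
    The abstract names an uncertainty-aware aggregation of local encoders with layer-wise MCMC sampling without giving details. The sketch below illustrates the general idea under assumed simplifications: each client contributes posterior samples of a layer's weights, and the server combines client means weighted by per-parameter precision. The function and variable names are hypothetical; this is not the HA-Fedformer algorithm itself.

```python
import numpy as np

def uncertainty_weighted_aggregate(client_samples):
    """Aggregate one layer's weights from several clients, down-weighting
    clients whose local posterior samples are more uncertain.

    client_samples: list of arrays of shape (num_mcmc_samples, *weight_shape).
    Returns the aggregated weight array of shape weight_shape.
    """
    means, precisions = [], []
    for samples in client_samples:
        samples = np.asarray(samples, dtype=float)
        means.append(samples.mean(axis=0))
        # Per-parameter precision (inverse variance) acts as the confidence weight.
        precisions.append(1.0 / (samples.var(axis=0) + 1e-8))
    means = np.stack(means)          # (num_clients, *weight_shape)
    precisions = np.stack(precisions)
    return (precisions * means).sum(axis=0) / precisions.sum(axis=0)

# Two clients, three posterior samples each, for a two-parameter layer.
client_a = np.array([[1.0, 0.5], [1.1, 0.5], [0.9, 0.5]])   # confident about w[1]
client_b = np.array([[2.0, 3.0], [0.0, 3.1], [1.0, 2.9]])   # noisier estimates
print(uncertainty_weighted_aggregate([client_a, client_b]))
```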

    Understanding User Behavior in Volumetric Video Watching: Dataset, Analysis and Prediction

    Full text link
    Volumetric video has emerged as an attractive new video paradigm in recent years, since it provides an immersive and interactive 3D viewing experience with six degrees of freedom (DoF). Unlike traditional 2D or panoramic videos, volumetric videos require dense point clouds, voxels, meshes, or huge neural models to depict volumetric scenes, which results in a prohibitively high bandwidth burden for video delivery. User behavior analysis, especially viewport and gaze analysis, therefore plays a significant role in prioritizing the content streamed within users' viewports and degrading the remaining content to maximize user QoE under limited bandwidth. Although understanding user behavior is crucial, to the best of our knowledge there are no available 3D volumetric video viewing datasets containing fine-grained user interactivity features, let alone further analysis and behavior prediction. In this paper, we release, for the first time, a volumetric video viewing behavior dataset with a large scale, multiple dimensions, and diverse conditions. We conduct an in-depth analysis to understand user behavior when viewing volumetric videos. Interesting findings on user viewport, gaze, and motion preferences related to different videos and users are revealed. We finally design a transformer-based viewport prediction model that fuses gaze and motion features and achieves high accuracy under various conditions. Our prediction model is expected to further benefit volumetric video streaming optimization. Our dataset, along with the corresponding visualization tools, is accessible at https://cuhksz-inml.github.io/user-behavior-in-vv-watching/ Comment: Accepted by ACM MM'2
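
    The abstract describes a transformer-based viewport predictor that fuses gaze and motion features but gives no architecture details. Below is a minimal sketch of one plausible fusion design in PyTorch, with assumed feature dimensions, additive fusion, and a single-step 6DoF output head; it is not the authors' model.

```python
import torch
import torch.nn as nn

class ViewportPredictor(nn.Module):
    """Toy transformer that fuses past gaze and head-motion features to
    predict the next 6DoF viewport pose (x, y, z, yaw, pitch, roll)."""

    def __init__(self, gaze_dim=3, motion_dim=6, d_model=64, nhead=4, num_layers=2):
        super().__init__()
        self.gaze_proj = nn.Linear(gaze_dim, d_model)
        self.motion_proj = nn.Linear(motion_dim, d_model)
        encoder_layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(encoder_layer, num_layers=num_layers)
        self.head = nn.Linear(d_model, 6)  # next-step viewport pose

    def forward(self, gaze_seq, motion_seq):
        # gaze_seq: (batch, T, gaze_dim); motion_seq: (batch, T, motion_dim)
        fused = self.gaze_proj(gaze_seq) + self.motion_proj(motion_seq)
        encoded = self.encoder(fused)            # (batch, T, d_model)
        return self.head(encoded[:, -1])         # predict from the last time step

model = ViewportPredictor()
gaze = torch.randn(8, 30, 3)      # 8 users, 30 past frames of gaze direction
motion = torch.randn(8, 30, 6)    # matching head translation + rotation
print(model(gaze, motion).shape)  # torch.Size([8, 6])
```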

    LiveVV: Human-Centered Live Volumetric Video Streaming System

    Full text link
    Volumetric video has emerged as a prominent medium within the realm of eXtended Reality (XR), thanks to advancements in computer graphics and depth capture hardware. Users can fully immerse themselves in volumetric video, with the ability to switch their viewport in six degrees of freedom (DoF), including three rotational dimensions (yaw, pitch, roll) and three translational dimensions (X, Y, Z). Different from traditional 2D videos composed of pixel matrices, volumetric videos employ point clouds, meshes, or voxels to represent a volumetric scene, resulting in significantly larger data sizes. While previous works have successfully achieved volumetric video streaming in video-on-demand scenarios, live streaming of volumetric video remains an unresolved challenge due to limited network bandwidth and stringent latency constraints. In this paper, we propose, for the first time, a holistic live volumetric video streaming system, LiveVV, which achieves multi-view capture, scene segmentation & reuse, adaptive transmission, and rendering. LiveVV contains multiple lightweight volumetric video capture modules that can be deployed without prior preparation. To reduce bandwidth consumption, LiveVV processes static and dynamic volumetric content separately by reusing static data with low disparity and decimating data with low visual saliency. Besides, to deal with network fluctuation, LiveVV integrates a volumetric video adaptive bitrate streaming algorithm (VABR) to enable fluent playback with the maximum quality of experience. Extensive real-world experiments show that LiveVV can achieve live volumetric video streaming at a frame rate of 24 fps with a latency of less than 350 ms.
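
    VABR's decision logic is not described in the abstract. As a rough illustration of what an adaptive bitrate rule for live volumetric streaming might look like, the sketch below picks the highest sustainable bitrate from an assumed ladder, backing off when the playback buffer nears the latency budget; the ladder values, safety factor, and thresholds are made-up parameters, not LiveVV's.

```python
def select_bitrate(bitrates_kbps, est_bandwidth_kbps, buffer_s,
                   target_buffer_s=0.35, safety=0.8):
    """Pick the highest bitrate the estimated bandwidth can sustain,
    stepping down when the playback buffer runs low (live latency budget)."""
    budget = est_bandwidth_kbps * safety
    if buffer_s < target_buffer_s:
        budget *= buffer_s / target_buffer_s   # be conservative near stalls
    feasible = [b for b in sorted(bitrates_kbps) if b <= budget]
    return feasible[-1] if feasible else min(bitrates_kbps)

ladder = [5_000, 15_000, 40_000, 80_000]       # hypothetical volumetric ladder
print(select_bitrate(ladder, est_bandwidth_kbps=60_000, buffer_s=0.3))  # -> 40000
```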

    RePAST: A ReRAM-based PIM Accelerator for Second-order Training of DNN

    Full text link
    Second-order training methods can converge much faster than first-order optimizers in DNN training, because second-order training uses the inverse of the second-order information (SOI) matrix to find a more accurate descent direction and step size. However, the huge SOI matrices bring significant computational and memory overheads on traditional architectures like GPUs and CPUs. On the other hand, ReRAM-based processing-in-memory (PIM) technology is well suited for second-order training for three reasons: First, PIM's computation happens in memory, which reduces data movement overhead; Second, ReRAM crossbars can compute the inverse of the SOI matrix in O(1) time; Third, if architected properly, ReRAM crossbars can perform the matrix inversions and vector-matrix multiplications that are central to second-order training algorithms. Nevertheless, current ReRAM-based PIM techniques still face a key challenge in accelerating second-order training: the existing ReRAM-based matrix inversion circuitry supports only 8-bit matrix inversion, a precision insufficient for second-order training, which needs at least 16-bit accurate matrix inversion. In this work, we propose a method to achieve high-precision matrix inversion based on proven 8-bit matrix inversion (INV) circuitry and vector-matrix multiplication (VMM) circuitry. We design RePAST, a ReRAM-based PIM accelerator architecture for second-order training. Moreover, we propose a software mapping scheme for RePAST to further optimize performance by fusing VMM and INV crossbars. Experiments show that RePAST achieves an average of 115.8×/11.4× speedup and 41.9×/12.8× energy savings compared to a GPU counterpart and PipeLayer on large-scale DNNs. Comment: 13 pages, 13 figures
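
    The abstract does not spell out how the 8-bit INV and VMM circuitry are combined into a high-precision inverse. One standard numerical route with the same ingredients is Newton-Schulz iterative refinement, sketched below in software with quantization standing in for the analog 8-bit inversion; this is an assumed illustration, not the RePAST circuit design.

```python
import numpy as np

def quantize(x, bits=8):
    """Crude uniform quantization standing in for the 8-bit analog circuitry."""
    scale = np.max(np.abs(x)) / (2 ** (bits - 1) - 1)
    return np.round(x / scale) * scale

def refine_inverse(a, num_iters=4, bits=8):
    """Refine a low-precision inverse with Newton-Schulz iterations,
    X_{k+1} = X_k (2I - A X_k), using only matrix-matrix products."""
    x = quantize(np.linalg.inv(a), bits)         # stand-in for the 8-bit INV array
    identity2 = 2.0 * np.eye(a.shape[0])
    for _ in range(num_iters):
        x = x @ (identity2 - a @ x)              # products would map onto VMM crossbars
    return x

rng = np.random.default_rng(0)
a = rng.standard_normal((16, 16)) + 16 * np.eye(16)   # well-conditioned test matrix
x = refine_inverse(a)
print(np.max(np.abs(a @ x - np.eye(16))))              # residual shrinks toward 0
```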

    Waste Management Strategy for the Nuclear Energy Cycle: Evidence from Coastal Nuclear Power Plants

    No full text

    Exploring the Applicability and Scaling Effects of Satellite-Observed Spring and Autumn Phenology in Complex Terrain Regions Using Four Different Spatial Resolution Products

    No full text
    Information on land surface phenology (LSP) has been extracted from remote sensing data in many studies. However, few studies have evaluated the impact of satellite products with different spatial resolutions on LSP extraction over regions with heterogeneous topography. To bridge this knowledge gap, this study took the Loess Plateau as an example region and employed four satellite datasets with different spatial resolutions (250, 500, and 1000 m MODIS NDVI during 2001–2020 and ~10 km GIMMS3g during 1982–2015) to investigate LSP changes. We used the correlation coefficient (r) and root mean square error (RMSE) to evaluate the performance of the various satellite products and further analyzed their applicability. Our results showed that the MODIS-based start of the growing season (SOS) and end of the growing season (EOS) were highly correlated with the ground-observed data, with r values of 0.82 and 0.79, respectively (p < 0.01), while the correlations between the GIMMS3g-derived phenology and the ground observations were not significant (p > 0.05). Spatially, the LSP derived from the MODIS products produced more reasonable spatial distributions. The inter-annually averaged MODIS SOS and EOS presented overall advancing and delaying trends during 2001–2020, respectively. More than two-thirds of the SOS advances and EOS delays occurred in grasslands, which determined the overall phenological changes across the entire Loess Plateau. However, the inter-annual trends of both SOS and EOS derived from the GIMMS3g data were opposite to those seen in the MODIS results. There were no significant differences among the three MODIS datasets (250, 500, and 1000 m), with biases lower than 2 days, RMSEs lower than 1 day, and correlation coefficients greater than 0.95 (p < 0.01). Furthermore, deriving phenology from 1000 m spatial resolution data was found to be feasible in regions with heterogeneous topography. Yet, in forest ecosystems and in areas with an accumulated temperature ≥10 °C, the differences in phenological phase between the MODIS products could be amplified.
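
    The abstract does not state how SOS and EOS were retrieved from the NDVI series. For orientation, the sketch below shows one widely used dynamic-threshold rule (a fixed fraction of the seasonal amplitude) applied to a synthetic year of composites; the amplitude fraction and all names are illustrative assumptions rather than the study's method.

```python
import numpy as np

def extract_sos_eos(ndvi, amplitude_fraction=0.5):
    """Estimate start/end of growing season from one year of NDVI composites
    using a dynamic threshold: the composites when NDVI first rises above and
    last falls below a fixed fraction of the seasonal amplitude."""
    ndvi = np.asarray(ndvi, dtype=float)
    threshold = ndvi.min() + amplitude_fraction * (ndvi.max() - ndvi.min())
    above = np.where(ndvi >= threshold)[0]
    if above.size == 0:
        return None, None
    return int(above[0]), int(above[-1])     # composite indices of SOS and EOS

# One synthetic year of 16-day NDVI composites (23 values).
doy = np.arange(23) * 16 + 8
ndvi = 0.2 + 0.5 * np.exp(-((doy - 200) / 60.0) ** 2)  # summer green-up peak
sos_idx, eos_idx = extract_sos_eos(ndvi)
print(doy[sos_idx], doy[eos_idx])            # approximate SOS/EOS days of year
```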